<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topic
  PUBLIC "-//OASIS//DTD DITA Composite//EN" "ditabase.dtd">
<topic id="topic1">
  <title id="bm20941">CREATE AGGREGATE</title>
  <body>
    <p id="sql_command_desc">Defines a new aggregate function.</p>
    <section id="section2"
      ><title>Synopsis</title><codeblock id="sql_command_synopsis">CREATE AGGREGATE <varname>name</varname> ( [ <varname>argmode</varname> ] [ <varname>argname</varname> ] <varname>arg_data_type</varname> [ , ... ] ) (
    SFUNC = <varname>statefunc</varname>,
    STYPE = <varname>state_data_type</varname>
    [ , SSPACE = <varname>state_data_size</varname> ]
    [ , FINALFUNC = <varname>ffunc</varname> ]
    [ , FINALFUNC_EXTRA ]
    [ , COMBINEFUNC = <varname>combinefunc</varname> ]
    [ , SERIALFUNC = <varname>serialfunc</varname> ]
    [ , DESERIALFUNC = <varname>deserialfunc</varname> ]
    [ , INITCOND = <varname>initial_condition</varname> ]
    [ , MSFUNC = <varname>msfunc</varname> ]
    [ , MINVFUNC = <varname>minvfunc</varname> ]
    [ , MSTYPE = <varname>mstate_data_type</varname> ]
    [ , MSSPACE = <varname>mstate_data_size</varname> ]
    [ , MFINALFUNC = <varname>mffunc</varname> ]
    [ , MFINALFUNC_EXTRA ]
    [ , MINITCOND = <varname>minitial_condition</varname> ]
    [ , SORTOP = <varname>sort_operator</varname> ]
  )
  
  CREATE AGGREGATE <varname>name</varname> ( [ [ <varname>argmode</varname> ] [ <varname>argname</varname> ] <varname>arg_data_type</varname> [ , ... ] ]
      ORDER BY [ <varname>argmode</varname> ] [ <varname>argname</varname> ] <varname>arg_data_type</varname> [ , ... ] ) (
    SFUNC = <varname>statefunc</varname>,
    STYPE = <varname>state_data_type</varname>
    [ , SSPACE = <varname>state_data_size</varname> ]
    [ , FINALFUNC = <varname>ffunc</varname> ]
    [ , FINALFUNC_EXTRA ]
    [ , COMBINEFUNC = <varname>combinefunc</varname> ]
    [ , SERIALFUNC = <varname>serialfunc</varname> ]
    [ , DESERIALFUNC = <varname>deserialfunc</varname> ]
    [ , INITCOND = <varname>initial_condition</varname> ]
    [ , HYPOTHETICAL ]
  )
  
  or the old syntax
  
  CREATE AGGREGATE <varname>name</varname> (
    BASETYPE = <varname>base_type</varname>,
    SFUNC = <varname>statefunc</varname>,
    STYPE = <varname>state_data_type</varname>
    [ , SSPACE = <varname>state_data_size</varname> ]
    [ , FINALFUNC = <varname>ffunc</varname> ]
    [ , FINALFUNC_EXTRA ]
    [ , COMBINEFUNC = <varname>combinefunc</varname> ]
    [ , SERIALFUNC = <varname>serialfunc</varname> ]
    [ , DESERIALFUNC = <varname>deserialfunc</varname> ]
    [ , INITCOND = <varname>initial_condition</varname> ]
    [ , MSFUNC = <varname>msfunc</varname> ]
    [ , MINVFUNC = <varname>minvfunc</varname> ]
    [ , MSTYPE = <varname>mstate_data_type</varname> ]
    [ , MSSPACE = <varname>mstate_data_size</varname> ]
    [ , MFINALFUNC = <varname>mffunc</varname> ]
    [ , MFINALFUNC_EXTRA ]
    [ , MINITCOND = <varname>minitial_condition</varname> ]
    [ , SORTOP = <varname>sort_operator</varname> ]
  )</codeblock></section>
    <section id="section3">
      <title>Description</title>
      <p><codeph>CREATE AGGREGATE</codeph> defines a new aggregate function. Some basic and
        commonly-used aggregate functions such as <codeph>count</codeph>, <codeph>min</codeph>,
          <codeph>max</codeph>, <codeph>sum</codeph>, <codeph>avg</codeph> and so on are already
        provided in Greenplum Database. If you define new types or need an aggregate function not
        already provided, you can use <codeph>CREATE AGGREGATE</codeph> to provide the desired
        features.</p>
      <p>If a schema name is given (for example, <codeph>CREATE AGGREGATE myschema.myagg
          ...</codeph>) then the aggregate function is created in the specified schema. Otherwise it
        is created in the current schema. </p>
      <p>An aggregate function is identified by its name and input data types. Two aggregate
        functions in the same schema can have the same name if they operate on different input
        types. The name and input data types of an aggregate function must also be distinct from the
        name and input data types of every ordinary function in the same schema. This behavior is
        identical to overloading of ordinary function names. See <codeph><xref
            href="CREATE_FUNCTION.xml#topic1"/></codeph>.</p>
      <p>A simple aggregate function is made from one, two, or three ordinary functions (which must
        be <codeph>IMMUTABLE</codeph> functions): </p>
      <ul id="ul_d5c_5yl_dhb">
        <li>a state transition function <varname>statefunc</varname></li>
        <li>an optional final calculation function <varname>ffunc</varname></li>
        <li>an optional combine function <varname>combinefunc</varname></li>
      </ul>
      <p>These functions are used as follows:</p>
      <codeblock><varname>statefunc</varname>( internal-state, next-data-values ) ---&gt; next-internal-state
<varname>ffunc</varname>( internal-state ) ---&gt; aggregate-value
<varname>combinefunc</varname>( internal-state, internal-state ) ---&gt; next-internal-state</codeblock>
      <p>Greenplum Database creates a temporary variable of data type
          <varname>state_data_type</varname> to hold the current internal state of the aggregate
        function. At each input row, the aggregate argument values are calculated and the state
        transition function is invoked with the current state value and the new argument values to
        calculate a new internal state value. After all the rows have been processed, the final
        function is invoked once to calculate the aggregate return value. If there is no final
        function then the ending state value is returned as-is.</p>
      <note>If you write a user-defined aggregate in C, and you declare the state value
          (<varname>state_data_type</varname>) as type <codeph>internal</codeph>, there is a risk of
        an out-of-memory error occurring. If <codeph>internal</codeph> state values are not properly
        managed and a query acquires too much memory for state values, an out-of-memory error could
        occur. To prevent this, use <codeph>mpool_alloc(<varname>mpool</varname>,
            <varname>size</varname>)</codeph> to have Greenplum manage and allocate memory for
        non-temporary state values, that is, state values that have a lifespan for the entire
        aggregation. The argument <codeph><varname>mpool</varname></codeph> of the
          <codeph>mpool_alloc()</codeph> function is
          <codeph>aggstate->hhashtable->group_buf</codeph>. For an example, see the implementation
        of the numeric data type aggregates in <codeph>src/backend/utils/adt/numeric.c</codeph> in
        the Greenplum Database open source code.</note>
      <p>You can specify <codeph><varname>combinefunc</varname></codeph> as a method for optimizing
        aggregate execution. By specifying <codeph><varname>combinefunc</varname></codeph>, the
        aggregate can be run in parallel on segments first and then on the master. When a
        two-level execution is performed, the <codeph><varname>statefunc</varname></codeph> is
        run on the segments to generate partial aggregate results, and
            <codeph><varname>combinefunc</varname></codeph> is run on the master to aggregate
        the partial results from segments. If single-level aggregation is performed, all the rows
        are sent to the master and the <codeph><varname>statefunc</varname></codeph> is applied to
        the rows.</p>
      <p>Single-level aggregation and two-level aggregation are equivalent execution strategies.
        Either type of aggregation can be implemented in a query plan. When you implement the
        functions <codeph>combinefunc</codeph> and <codeph>statefunc</codeph>, you must ensure that
        the invocation of the <codeph><varname>statefunc</varname></codeph> on the segment instances
        followed by <codeph>combinefunc</codeph> on the master produce the same result as
        single-level aggregation that sends all the rows to the master and then applies only the
            <codeph><varname>statefunc</varname></codeph>  to the rows.</p>
      <p>An aggregate function can provide an optional initial condition, an initial value for the
        internal state value. This is specified and stored in the database as a value of type
        <codeph>text</codeph>,
        but it must be a valid external representation of a constant of the state value data type.
        If it is not supplied then the state value starts out <codeph>NULL</codeph>. </p>
      <p>If  <codeph><varname>statefunc</varname></codeph>  is declared <codeph>STRICT</codeph>,
        then it cannot be called with <codeph>NULL</codeph> inputs. With such a transition function,
        aggregate execution behaves as follows. Rows with any null input values are ignored (the
        function is not called and the previous state value is retained). If the initial state value
        is <codeph>NULL</codeph>, then at the first row with all non-null input values, the first
        argument value replaces the state value, and the transition function is invoked at
        subsequent rows with all non-null input values. This is useful for implementing aggregates
        like <codeph>max</codeph>. Note that this behavior is only available when
          <varname>state_data_type</varname> is the same as the first
          <varname>arg_data_type</varname>. When these types are different, you must supply a
        non-null initial condition or use a nonstrict transition function.</p>
      <p>If  <varname>statefunc</varname> is not declared <codeph>STRICT</codeph>, then it will be
        called unconditionally at each input row, and must deal with <codeph>NULL</codeph> inputs
        and <codeph>NULL</codeph> state values for itself. This allows the aggregate author to
        have full control over the aggregate's handling of <codeph>NULL</codeph> values.</p>
      <p>If the final function (<codeph><varname>ffunc</varname></codeph>) is declared
          <codeph>STRICT</codeph>, then it will not be called when the ending state value is
          <codeph>NULL</codeph>; instead a <codeph>NULL</codeph> result will be returned
        automatically. (This is the normal behavior of <codeph>STRICT</codeph> functions.) In any
        case the final function has the option of returning a <codeph>NULL</codeph> value. For
        example, the final function for <codeph>avg</codeph> returns <codeph>NULL</codeph> when it
        sees there were zero input rows.</p>
      <p> Sometimes it is useful to declare the final function as taking not just the state value,
        but extra parameters corresponding to the aggregate's input values. The main reason for
        doing this is if the final function is polymorphic and the state value's data type would be
        inadequate to pin down the result type. These extra parameters are always passed as
          <codeph>NULL</codeph> (and so the final function must not be strict when the
          <codeph>FINALFUNC_EXTRA</codeph> option is used), but nonetheless they are valid
        parameters. The final function could for example make use of
          <codeph>get_fn_expr_argtype</codeph> to identify the actual argument type in the current
        call. </p>
      <p> An aggregate can optionally support <i>moving-aggregate mode</i>, as described in <xref
          href="https://www.postgresql.org/docs/9.4/xaggr.html#XAGGR-MOVING-AGGREGATES"
          scope="external" format="html">Moving-Aggregate Mode</xref> in the PostgreSQL
        documentation. This requires specifying the <codeph><varname>msfunc</varname></codeph>,
            <codeph><varname>minvfunc</varname></codeph>, and
          <codeph><varname>mstype</varname></codeph> functions, and optionally the
            <codeph><varname>mspace</varname></codeph>,
          <codeph><varname>mfinalfunc</varname></codeph>,
            <codeph><varname>mfinalfunc_extra</varname></codeph>, and
            <codeph><varname>minitcond</varname></codeph> functions. Except for
            <codeph><varname>minvfunc</varname></codeph>, these functions work like the
        corresponding simple-aggregate functions without <codeph><varname>m</varname></codeph>; they
        define a separate implementation of the aggregate that includes an inverse transition
        function. </p>
      <p>The syntax with <codeph>ORDER BY</codeph> in the parameter list creates a special type of
        aggregate called an <i>ordered-set aggregate</i>; or if <codeph>HYPOTHETICAL</codeph> is
        specified, then a <i>hypothetical-set aggregate</i> is created. These aggregates operate
        over groups of sorted values in order-dependent ways, so that specification of an input sort
        order is an essential part of a call. Also, they can have <i>direct</i> arguments, which are
        arguments that are evaluated only once per aggregation rather than once per input row.
        Hypothetical-set aggregates are a subclass of ordered-set aggregates in which some of the
        direct arguments are required to match, in number and data types, the aggregated argument
        columns. This allows the values of those direct arguments to be added to the collection of
        aggregate-input rows as an additional "hypothetical" row. </p>
      <p>Single argument aggregate functions, such as <codeph>min</codeph> or
        <codeph>max</codeph>, can sometimes be optimized by
        looking into an index instead of scanning every input row. If this aggregate can be so
        optimized, indicate it by specifying a <i>sort operator</i>. The basic requirement is that the
        aggregate must yield the first element in the sort ordering induced by the operator; in
        other words:</p>
      <codeblock>SELECT <varname>agg</varname>(<varname>col</varname>) FROM <varname>tab</varname>; </codeblock>
      <p>must be equivalent to:</p>
      <codeblock>SELECT <varname>col</varname> FROM <varname>tab</varname> ORDER BY <varname>col</varname> USING <varname>sortop</varname> LIMIT 1;</codeblock>
      <p>Further assumptions are that the aggregate function ignores <codeph>NULL</codeph> inputs,
        and that it delivers a <codeph>NULL</codeph> result if and only if there were no non-null
        inputs. Ordinarily, a data type's <codeph>&lt;</codeph> operator is the proper sort operator
        for <codeph>MIN</codeph>, and <codeph>&gt;</codeph> is the proper sort operator for
          <codeph>MAX</codeph>. Note that the optimization will never actually take effect unless
        the specified operator is the "less than" or "greater than" strategy member of a B-tree
        index operator class.</p>
      <p> To be able to create an aggregate function, you must have <codeph>USAGE</codeph> privilege
        on the argument types, the state type(s), and the return type, as well as
          <codeph>EXECUTE</codeph> privilege on the transition and final functions. </p>
    </section>
    <section id="section5">
      <title>Parameters</title>
      <parml>
        <plentry>
          <pt><varname>name</varname></pt>
          <pd>The name (optionally schema-qualified) of the aggregate function to create.</pd>
        </plentry>
        <plentry>
          <pt><varname>argmode</varname></pt>
          <pd>The mode of an argument: <codeph>IN</codeph> or <codeph>VARIADIC</codeph>. (Aggregate
            functions do not support <codeph>OUT</codeph> arguments.) If omitted, the default is
              <codeph>IN</codeph>. Only the last argument can be marked
            <codeph>VARIADIC</codeph>.</pd>
        </plentry>
        <plentry>
          <pt><varname>argname</varname></pt>
          <pd>The name of an argument. This is currently only useful for documentation purposes. If
            omitted, the argument has no name.</pd>
        </plentry>
        <plentry>
          <pt><varname>arg_data_type</varname></pt>
          <pd>An input data type on which this aggregate function operates. To create a
            zero-argument aggregate function, write <codeph>*</codeph> in place of the list of
            argument specifications. (An example of such an aggregate is <codeph>count(*)</codeph>.)
          </pd>
        </plentry>
        <plentry>
          <pt><varname>base_type</varname></pt>
          <pd>In the old syntax for <codeph>CREATE AGGREGATE</codeph>, the input data type is
            specified by a <codeph>basetype</codeph> parameter rather than being written next to the
            aggregate name. Note that this syntax allows only one input parameter. To define a
            zero-argument aggregate function with this syntax, specify the <codeph>basetype</codeph>
            as <codeph>"ANY"</codeph> (not <codeph>*</codeph>). Ordered-set aggregates cannot be
            defined with the old syntax. </pd>
        </plentry>
        <plentry>
          <pt><varname>statefunc</varname></pt>
          <pd>The name of the state transition function to be called for each input row. For a
            normal N-argument aggregate function,  the state transition function
                <codeph><varname>statefunc</varname></codeph> must take N+1 arguments, the first
            being of type <varname>state_data_type</varname> and the rest matching the declared
            input data types of the aggregate. The function must return a value of type
              <varname>state_data_type</varname>. This function takes the current state value and
            the current input data values, and returns the next state value.</pd>
          <pd>For ordered-set (including hypothetical-set) aggregates, the state transition function
              <varname>statefunc</varname> receives only the current state value and the aggregated
            arguments, not the direct arguments. Otherwise it is the same. </pd>
        </plentry>
        <plentry>
          <pt><varname>state_data_type</varname></pt>
          <pd>The data type for the aggregate's state value.</pd>
        </plentry>
        <plentry>
          <pt><varname>state_data_size</varname></pt>
          <pd>The approximate average size (in bytes) of the aggregate's state value. If this
            parameter is omitted or is zero, a default estimate is used based on the
              <varname>state_data_type</varname>. The planner uses this value to estimate the memory
            required for a grouped aggregate query. Large values of this parameter discourage use of
            hash aggregation.</pd>
        </plentry>
        <plentry>
          <pt><varname>ffunc</varname></pt>
          <pd>The name of the final function called to compute the aggregate result after all input
            rows have been traversed. The function must take a single argument of type
              <varname>state_data_type</varname>. The return data type of the aggregate is defined
            as the return type of this function. If <codeph><varname>ffunc</varname></codeph> is not
            specified, then the ending state value is used as the aggregate result, and the return
            type is <varname>state_data_type</varname>. </pd>
          <pd>For ordered-set (including hypothetical-set) aggregates, the final function receives
            not only the final state value, but also the values of all the direct arguments.</pd>
          <pd>If <codeph>FINALFUNC_EXTRA</codeph> is specified, then in addition to the final state
            value and any direct arguments, the final function receives extra NULL values
            corresponding to the aggregate's regular (aggregated) arguments. This is mainly useful
            to allow correct resolution of the aggregate result type when a polymorphic aggregate is
            being defined. </pd>
        </plentry>
        <plentry>
          <pt><varname>combinefunc</varname></pt>
          <pd>The name of a combine function. This is a function of two arguments, both of type
              <varname>state_data_type</varname>. It must return a value of
              <varname>state_data_type</varname>. A combine function takes two transition state
            values and returns a new transition state value representing the combined aggregation.
            In Greenplum Database, if the result of the aggregate function is computed in a
            segmented fashion, the combine function is invoked on the individual internal states in
            order to combine them into an ending internal state.</pd>
          <pd>Note that this function is also called in hash aggregate mode within a segment.
            Therefore, if you call this aggregate function without a combine function, hash
            aggregate is never chosen. Since hash aggregate is efficient, consider defining a
            combine function whenever possible.</pd>
        </plentry>
        <plentry>
          <pt><varname>serialfunc</varname></pt>
          <pd> An aggregate function whose <varname>state_data_type</varname> is
              <codeph>internal</codeph> can participate in parallel aggregation only if it has a
              <varname>serialfunc</varname> function, which must serialize the aggregate state into
            a <codeph>bytea</codeph> value for transmission to another process. This function must
            take a single argument of type <codeph>internal</codeph> and return type
              <codeph>bytea</codeph>. A corresponding <varname>deserialfunc</varname> is also
            required. </pd>
        </plentry>
        <plentry>
          <pt><varname>deserialfunc</varname></pt>
          <pd> Deserialize a previously serialized aggregate state back into
              <varname>state_data_type</varname>. This function must take two arguments of types
              <codeph>bytea</codeph> and <codeph>internal</codeph>, and produce a result of type
              <codeph>internal</codeph>. (Note: the second, <codeph>internal</codeph> argument is
            unused, but is required for type safety reasons.) </pd>
        </plentry>
        <plentry>
          <pt><varname>initial_condition</varname></pt>
          <pd> The initial setting for the state value. This must be a string constant in the form
            accepted for the data type <varname>state_data_type</varname>. If not specified, the
            state value starts out null. </pd>
        </plentry>
        <plentry>
          <pt><varname>msfunc</varname></pt>
          <pd> The name of the forward state transition function to be called for each input row in
            moving-aggregate mode. This is exactly like the regular transition function, except that
            its first argument and result are of type <varname>mstate_data_type</varname>, which
            might be different from <varname>state_data_type</varname>. </pd>
        </plentry>
        <plentry>
          <pt><varname>minvfunc</varname></pt>
          <pd> The name of the inverse state transition function to be used in moving-aggregate
            mode. This function has the same argument and result types as <varname>msfunc</varname>,
            but it is used to remove a value from the current aggregate state, rather than add a
            value to it. The inverse transition function must have the same strictness attribute as
            the forward state transition function. </pd>
        </plentry>
        <plentry>
          <pt><varname>mstate_data_type</varname></pt>
          <pd> The data type for the aggregate's state value, when using moving-aggregate mode.
          </pd>
        </plentry>
        <plentry>
          <pt><varname>mstate_data_size</varname></pt>
          <pd> The approximate average size (in bytes) of the aggregate's state value, when using
            moving-aggregate mode. This works the same as <varname>state_data_size</varname>. </pd>
        </plentry>
        <plentry>
          <pt><varname>mffunc</varname></pt>
          <pd> The name of the final function called to compute the aggregate's result after all
            input rows have been traversed, when using moving-aggregate mode. This works the same as
              <varname>ffunc</varname>, except that its first argument's type is
              <varname>mstate_data_type</varname> and extra dummy arguments are specified by writing
              <codeph>MFINALFUNC_EXTRA</codeph>. The aggregate result type determined by
              <varname>mffunc</varname> or <varname>mstate_data_type</varname> must match that
            determined by the aggregate's regular implementation. </pd>
        </plentry>
        <plentry>
          <pt><varname>minitial_condition</varname></pt>
          <pd> The initial setting for the state value, when using moving-aggregate mode. This works
            the same as <varname>initial_condition</varname>. </pd>
        </plentry>
        <plentry>
          <pt><varname>sort_operator</varname></pt>
          <pd> The associated sort operator for a <codeph>MIN</codeph>- or <codeph>MAX</codeph>-like
            aggregate. This is just an operator name (possibly schema-qualified). The operator is
            assumed to have the same input data types as the aggregate (which must be a
            single-argument normal aggregate). </pd>
        </plentry>
        <plentry>
          <pt><varname>HYPOTHETICAL</varname></pt>
          <pd> For ordered-set aggregates only, this flag specifies that the aggregate arguments are
            to be processed according to the requirements for hypothetical-set aggregates: that is,
            the last few direct arguments must match the data types of the aggregated
              (<codeph>WITHIN GROUP</codeph>) arguments. The <codeph>HYPOTHETICAL</codeph> flag has
            no effect on run-time behavior, only on parse-time resolution of the data types and
            collations of the aggregate's arguments. </pd>
        </plentry>
      </parml>
    </section>
    <section id="section6">
      <title>Notes</title>
      <p>The ordinary functions used to define a new aggregate function must be defined first. Note
        that in this release of Greenplum Database, it is required that the
          <varname>statefunc</varname>, <varname>ffunc</varname>, and <varname>combinefunc</varname>
        functions used to create the aggregate are defined as <codeph>IMMUTABLE</codeph>.</p>
      <p>If the value of the Greenplum Database server configuration parameter
          <codeph>gp_enable_multiphase_agg</codeph> is <codeph>off</codeph>, only single-level
        aggregation is performed. </p>
      <p>Any compiled code (shared library files) for custom functions must be placed in the same
        location on every host in your Greenplum Database array (master and all segments). This
        location must also be in the <codeph>LD_LIBRARY_PATH</codeph> so that the server can locate
        the files.</p>
      <p>In previous versions of Greenplum Database, there was a concept of ordered aggregates.
        Since version 6, any aggregate can be called as an ordered aggregate, using the syntax:
        <codeblock>name ( arg [ , ... ] [ORDER BY sortspec [ , ...]] )</codeblock></p>
      <p>The <codeph>ORDERED</codeph> keyword is accepted for backwards compatibility, but is
        ignored.</p>
      <p>In previous versions of Greenplum Database, the <codeph>COMBINEFUNC</codeph> option was
        called <codeph>PREFUNC</codeph>. It is still accepted for backwards compatibility, as a
        synonym for <codeph>COMBINEFUNC</codeph>.</p>
    </section>
    <section id="section7">
      <title>Example</title>
      <p>The following simple example creates an aggregate function that computes the sum of two
        columns. </p>
      <p>Before creating the aggregate function, create two functions that are used as the
            <codeph><varname>statefunc</varname></codeph>  and
            <codeph><varname>combinefunc</varname></codeph> functions of the aggregate function. </p>
      <p>This function is specified as the <codeph><varname>statefunc</varname></codeph> function in
        the aggregate function.</p>
      <codeblock>CREATE FUNCTION mysfunc_accum(numeric, numeric, numeric) 
  RETURNS numeric
   AS 'select $1 + $2 + $3'
   LANGUAGE SQL
   IMMUTABLE
   RETURNS NULL ON NULL INPUT;</codeblock>
      <p>This function is specified as the <codeph><codeph>combinefunc</codeph></codeph> function in
        the aggregate function.</p>
      <codeblock>CREATE FUNCTION mycombine_accum(numeric, numeric )
  RETURNS numeric
   AS 'select $1 + $2'
   LANGUAGE SQL
   IMMUTABLE
   RETURNS NULL ON NULL INPUT;</codeblock>
      <p>This <codeph>CREATE AGGREGATE</codeph> command creates the aggregate function that adds two
        columns. </p>
      <codeblock>CREATE AGGREGATE agg_prefunc(numeric, numeric) (
   SFUNC = mysfunc_accum,
   STYPE = numeric,
   COMBINEFUNC = mycombine_accum,
   INITCOND = 0 );</codeblock>
      <p>The following commands create a table, adds some rows, and runs the aggregate function.</p>
      <codeblock>create table t1 (a int, b int) DISTRIBUTED BY (a);
insert into t1 values
   (10, 1),
   (20, 2),
   (30, 3);
select agg_prefunc(a, b) from t1;</codeblock>
      <p>This <codeph>EXPLAIN</codeph> command shows two phase aggregation. </p>
      <codeblock>explain select agg_prefunc(a, b) from t1;

QUERY PLAN
-------------------------------------------------------------------------- 
Aggregate (cost=1.10..1.11 rows=1 width=32)  
 -&gt; Gather Motion 2:1 (slice1; segments: 2) (cost=1.04..1.08 rows=1
      width=32)
     -&gt; Aggregate (cost=1.04..1.05 rows=1 width=32)
       -&gt; Seq Scan on t1 (cost=0.00..1.03 rows=2 width=8)
 Optimizer: Pivotal Optimizer (GPORCA)
 (5 rows)</codeblock>
    </section>
    <section id="section8">
      <title>Compatibility</title>
      <p><codeph>CREATE AGGREGATE</codeph> is a Greenplum Database language extension. The SQL
        standard does not provide for user-defined aggregate functions.</p>
    </section>
    <section id="section9">
      <title>See Also</title>
      <p><codeph><xref href="ALTER_AGGREGATE.xml#topic1" type="topic" format="dita"/></codeph>,
            <codeph><xref href="./DROP_AGGREGATE.xml#topic1" type="topic" format="dita"/></codeph>,
            <codeph><xref href="./CREATE_FUNCTION.xml#topic1" type="topic" format="dita"
        /></codeph></p>
    </section>
  </body>
</topic>
